[57] ABSTRACT

A program interpreter for computer programs written in a
bytecode language, which uses a restricted set of data type
specific bytecodes. The interpreter, prior to executing any
bytecode program, executes a bytecode program verifier
procedure that verifies the integrity of a specified program
by identifying any bytecode instruction that would process
data of the wrong type for such a bytecode and any bytecode
instruction sequences in the specified program that would
cause underflow or overflow of the operand stack. If the
program verifier finds any instructions that violate predefined
stack usage and data type usage restrictions, execution
of the program by the interpreter is prevented. After
pre-processing of the program by the verifier, if no program
faults were found, the interpreter executes the program
without performing operand stack overflow and underflow
checks and without performing data type checks on operands
stored in the operand stack. As a result, program execution
speed is greatly improved.

18 Claims, 11 Drawing Sheets 
BYTECODE PROGRAM INTERPRETER
APPARATUS AND METHOD WITH
PRE-VERIFICATION OF DATA TYPE
RESTRICTIONS AND OBJECT
INITIALIZATION

This application is a continuation-in-part of U.S. application
Ser. No. 08/360202, filed Dec. 20, 1994.

The present invention relates generally to the use of 
computer software on multiple computer platforms which 
use distinct underlying machine instruction sets, and more 
specifically to a program verifier and method that verify the
integrity of computer software obtained from a network 
server or other source. 

BACKGROUND OF THE INVENTION 

Referring to FIG. 1, in a networked computer system 100, 
a first computer 102 may download a computer program 103 
residing on a second computer 104. In this example, the first 
user node 102 will typically be a user workstation (often 
called a client) having a central processing unit 106, a user
interface 108, memory 110 (e.g., random access memory 
and disk memory) for storing an operating system 112, 
programs, documents and other data, and a communications 
interface 114 for connecting to a computer network 120 such 
as the Internet, a local area network or a wide area network. 
The computers 102 and 104 are often called "nodes on the 
network" or "network nodes." 

The second computer 104 will often be a network server, 
but may be a second user workstation, and typically would 
contain the same basic array of computer components as the
first computer. 

In the prior art (unlike the system shown in FIG. 1), after
the first computer 102 downloads a copy of a computer 
program 103 from the second computer 104, there are 
essentially no standardized tools available to help the user of 
the first computer 102 to verify the integrity of the down- 
loaded program 103. In particular, unless the first computer 
user studies the source code of the downloaded program, it 
is virtually impossible using prior art tools to determine 
whether the downloaded program 103 will underflow or 
overflow its stack, or whether the downloaded program 103 
will violate files and other resources on the user's computer. 

A second issue with regard to downloading computer 
software from one computer to another concerns transferring 
computer software between computer platforms which use 
distinct underlying machine instruction sets. There are some 
prior art examples of platform independent computer pro- 
grams and platform independent computer programming 
languages. What the prior art lacks are reliable and auto- 
mated software verification tools for enabling recipients of 
such software to verify the integrity of transferred platform 
independent computer software obtained from a network 
server or other source. 

SUMMARY OF THE INVENTION 

The present invention verifies the integrity of computer
programs written in a bytecode language, commercialized as
the JAVA bytecode language, which uses a restricted set of
data type specific bytecodes. All the available source code
bytecodes in the language either (A) are stack data consuming
bytecodes that have associated data type restrictions as
to the types of data that can be processed by each such
bytecode, (B) do not utilize stack data but affect the stack by
either adding data of known data type to the stack or by
removing data from the stack without regard to data type, or
(C) neither use stack data nor add data to the stack.
The present invention provides a verifier tool and method
for identifying, prior to execution of a bytecode program,
any instruction sequence that attempts to process data of the
wrong type for such a bytecode, and any bytecode instructions
in the specified program whose execution would cause
underflow or overflow of the operand stack, and for preventing
the use of such a program.

The bytecode program verifier of the present invention
includes a virtual operand stack for temporarily storing stack
information indicative of data stored in a program operand
stack during the actual execution of a specified bytecode
program. The verifier processes the specified program using
data flow analysis, processing each bytecode instruction of
the program whose stack and register input status map is
affected by another instruction processed by the verifier. A
stack and register input status map is generated for every
analyzed bytecode instruction, and when an instruction is a
successor to multiple other instructions, its status map is
generated by merging the status maps created during the
processing of each of the predecessor instructions. The
verifier also compares the stack and register status map
information with data type restrictions associated with each
bytecode instruction so as to determine if the operand stack
or registers during program execution would contain data
inconsistent with the data type restrictions of the bytecode
instruction, and also determines if any bytecode instructions
in the specified program would cause underflow or overflow
of the operand stack.

The merger of stack and register status maps requires
special handling for the instructions associated with exception
handlers and the instructions associated with subroutine
calls (including "finally" instruction blocks that are executed
via a subroutine call whenever a protected code block is
exited).
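
The data flow analysis and status map merging summarized above can be illustrated with a minimal Python sketch. It is not the verifier of the preferred embodiment: the five-instruction toy language, the names `EFFECTS`, `merge`, `step`, `successors` and `verify`, and the rule that "compatible" means "identical" are all assumptions made for illustration, and registers, exception handlers and subroutine calls are left out.

```python
class VerifyError(Exception):
    """Raised when the toy verifier rejects a program."""

# Hypothetical toy opcodes: (type markers popped, type markers pushed).
EFFECTS = {
    "push_int": ([], ["I"]),
    "iadd": (["I", "I"], ["I"]),
    "ifeq": (["I"], []),      # conditional jump, consumes one integer
    "goto": ([], []),
    "return": ([], []),
}

def merge(old, new):
    """Merge two stack status maps; None means 'not yet visited'."""
    if old is None:
        return list(new)
    if len(old) != len(new):
        raise VerifyError("stack depth differs between predecessors")
    if old != new:            # assumption: compatible means identical
        raise VerifyError("incompatible data types between predecessors")
    return old

def step(instr, stack, max_stack):
    """Emulate one instruction on a copy of the virtual stack."""
    stack = list(stack)
    pops, pushes = EFFECTS[instr[0]]
    for wanted in pops:
        if not stack:
            raise VerifyError("stack underflow")
        if stack.pop() != wanted:
            raise VerifyError("wrong data type")
    for marker in pushes:
        if len(stack) >= max_stack:
            raise VerifyError("stack overflow")
        stack.append(marker)
    return stack

def successors(pc, instr):
    """Return the program counters that can execute right after pc."""
    op = instr[0]
    if op == "return":
        return []
    if op == "goto":
        return [instr[1]]
    if op == "ifeq":
        return [pc + 1, instr[1]]
    return [pc + 1]

def verify(code, max_stack):
    """Process instructions until no status map changes any more."""
    snapshots = [None] * len(code)
    changed = [False] * len(code)
    snapshots[0], changed[0] = [], True      # method entry: empty stack
    while any(changed):
        pc = changed.index(True)
        changed[pc] = False
        out = step(code[pc], snapshots[pc], max_stack)
        for nxt in successors(pc, code[pc]):
            if not 0 <= nxt < len(code):
                raise VerifyError("control flow leaves the method")
            first_visit = snapshots[nxt] is None
            snapshots[nxt] = merge(snapshots[nxt], out)
            if first_visit:
                changed[nxt] = True
    return True
```

In this sketch a loop that pushes a value on every iteration, such as `[("push_int",), ("goto", 0)]`, is rejected because the status map reaching instruction 0 from the jump has a different stack depth than the one recorded at method entry.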

After pre-processing of the program by the verifier, if no
program faults were found, a bytecode program interpreter
executes the program without performing operand stack
overflow and underflow checks and without performing data
type checks on operands stored in the operand stack. As a
result, program execution speed is greatly improved.

BRIEF DESCRIPTION OF THE DRAWINGS 

The accompanying drawings, which are incorporated in
and form a part of this specification, illustrate embodiments
of the invention and, together with the description, serve to
explain the principles of the invention, wherein:

FIG. 1 is a block diagram of a computer system incorporating
a preferred embodiment of the present invention.

FIG. 2 is a block diagram of the data structure for an
object in a preferred embodiment of the present invention.

FIG. 3 is a block diagram of the data structures maintained
by a bytecode verifier during verification of a bytecode
program in accordance with the present invention.

FIGS. 4A-4G represent flow charts of the bytecode
program verification process in the preferred embodiment of
the present invention.

FIG. 5 represents a flow chart of the class loader and
bytecode program interpreter process in the preferred
embodiment of the present invention.

DETAILED DESCRIPTION OF THE 
PREFERRED EMBODIMENTS 

Reference will now be made in detail to the preferred
embodiments of the invention, examples of which are
illustrated in the accompanying drawings. While the invention
will be described in conjunction with the preferred
embodiments, it will be understood that they are not
intended to limit the invention to those embodiments. On the
contrary, the invention is intended to cover alternatives,
modifications and equivalents, which may be included
within the spirit and scope of the invention as defined by the
appended claims.

Referring now to a distributed computer system 100 as
shown in FIG. 1, there is shown a distributed computer
system 100 having multiple client computers 102 and multiple
server computers 104. In the preferred embodiment,
each client computer 102 is connected to the servers 104 via
the Internet 120, although other types of communication
connections could be used. While most client computers are
desktop computers, such as Sun workstations, IBM compatible
computers and Macintosh computers, virtually any type
of computer can be a client computer. In the preferred
embodiment, each client computer includes a CPU 106, a
user interface 108, memory 110, and a communications
interface 114. Memory 110 stores:

  an operating system 112;

  an Internet communications manager program 116;

  a bytecode program verifier 120 for verifying whether or
  not a specified program satisfies certain predefined
  integrity criteria;

  a bytecode program interpreter 122 for executing
  application programs;

  a class loader 124 which loads object classes into a user's
  address space and utilizes the bytecode program verifier
  to verify the integrity of the methods associated
  with each loaded object class;

  at least one class repository 126, for locally storing object
  classes 128 in use and/or available for use by users of
  the computer 102;

  at least one object repository 130 for storing objects 132,
  which are instances of objects of the object classes
  stored in the class repository 126.

In the preferred embodiment the operating system 112 is an
object oriented multitasking operating system that supports
multiple threads of execution within each defined address
space.

The bytecode program verifier 120 includes a snapshot
array 140, a status array 142 and other data structures that
will be described in more detail below.

The class loader 124 is typically invoked when a user first
initiates execution of a procedure, requiring that an object of
the appropriate object class be generated. The class loader
124 loads in the appropriate object class and calls the
bytecode program verifier 120 to verify the integrity of all
the bytecode programs in the loaded object class. If all the
methods are successfully verified, an object instance of the
object class is generated, and the bytecode interpreter 122 is
invoked to execute the user requested procedure, which is
typically called a method. If the procedure requested by the
user is not a bytecode program and if execution of the
non-bytecode program is allowed (which is outside the
scope of the present document), the program is executed by
a compiled program executer (not shown).

The class loader is also invoked whenever an executing
bytecode program encounters a call to an object method for
an object class that has not yet been loaded into the user's
address space. Once again the class loader 124 loads in the
appropriate object class and calls the bytecode program
verifier 120 to verify the integrity of all the bytecode
programs in the loaded object class. In many situations the
object class will be loaded from a remotely located
computer, such as one of the servers 104 shown in FIG. 1.
If all the methods in the loaded object class are successfully
verified, an object instance of the object class is generated
and the bytecode interpreter 122 is invoked to execute the
called object method.

FIG. 2 shows the data structure 200 in a preferred
embodiment of the present invention for an object A-01 of
class A. An object of object class A has an object handle 202
that includes a pointer 204 to the methods for the object and
a pointer 206 to a data array 208 for the object.

The pointer 204 to the object's methods is actually an
indirect pointer to the methods of the associated object class.
More particularly, the method pointer 204 points to the
Virtual Function Table (VFT) 210 for the object's object
class. Each object class has a VFT 210 that includes (A)
pointers 212 to each of the methods 214 of the object class,
(B) one or more pointers 215 to methods 216 associated with
superclasses of class A, and (C) a pointer 217 to a special
Class Object 218.

Referring to FIGS. 1 and 2, in the preferred embodiment
the methods in an object class to be loaded are bytecode
programs, which when interpreted will result in a series of
instructions. A listing of all the source code
bytecode instructions in the JAVA instruction set is provided
in Table 1. The JAVA instruction set is characterized
by bytecode instructions that are data type specific.
Specifically, the JAVA instruction set distinguishes the same
basic operation on different primitive data types by
designating separate opcodes. Accordingly, a plurality of
bytecodes are included within the instruction set to perform
the same basic function (for example to add two numbers),
with each such bytecode being used to process only data of a
corresponding distinct data type. In addition, the JAVA
instruction set is notable for instructions not included. For
instance, there are no instructions in the JAVA bytecode
language for converting numbers into object references. These
restrictions on the JAVA bytecode instruction set help to ensure
that any bytecode program which utilizes data in a manner
consistent with the data type specific instructions in the
JAVA instruction set will not violate the integrity of a user's
computer system.

In the preferred embodiment, the available data types are
integer, long integer, single precision floating point, double
precision floating point, handles (sometimes herein called
objects or object references), and return addresses (pointers
to virtual machine code). Additional data types are arrays of
integers, arrays of long integers, arrays of single precision
floating point numbers, arrays of double precision floating
point numbers, arrays of handles, arrays of booleans, arrays
of bytes (8-bit integers), arrays of short integers (16-bit
signed integers), and arrays of Unicode characters.

The "handle" data type includes a virtually unlimited
number of data subtypes because each handle data type
includes an object class specification as part of the data type.
In addition, constants used in programs are also data typed,
with the available constant data types in the preferred
embodiment comprising the data types mentioned above,
plus class, fieldref, methodref, string, and Asciz, all of which
represent two or more bytes having a specific purpose.

The few bytecodes that are data type independent perform
stack manipulation functions such as (A) duplicating one or
more words on the stack and placing them at specific
locations within the stack, thereby producing more stack
items of known data type, or (B) clearing one or more items
from the stack. A few other data type independent bytecodes
do not utilize any words on the stack and leave the stack
unchanged, or add words to the stack without utilizing any
of the words previously on the stack. These bytecodes do not
have any data type restrictions with regard to the stack
contents prior to their execution, and all but a few modify the
stack's contents and thus affect the program verification
process.
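
The distinction between data type specific and data type independent bytecodes can be sketched in Python as follows. The mnemonics `iadd`, `fadd`, `dup`, `pop` and `nop` are taken from the JVM instruction set; the `TYPED` table, the `apply_opcode` function and the `VerificationFault` exception are hypothetical names introduced here, and the one-word treatment of `dup` and `pop` is a simplification.

```python
class VerificationFault(Exception):
    """Raised when an opcode is applied to unsuitable stack contents."""

# (type markers popped, type markers pushed) for two data type specific
# opcodes that perform the same basic operation on different types.
TYPED = {
    "iadd": (["I", "I"], ["I"]),   # add two integers
    "fadd": (["F", "F"], ["F"]),   # add two single precision floats
}

def apply_opcode(opcode, type_stack):
    """Return the type stack that results from executing `opcode`."""
    stack = list(type_stack)
    if opcode == "nop":            # uses no stack data, leaves it unchanged
        return stack
    if opcode in ("dup", "pop"):   # data type independent stack manipulation
        if not stack:
            raise VerificationFault("stack underflow")
        if opcode == "dup":
            stack.append(stack[-1])    # copy has the same known type
        else:
            stack.pop()                # removed without regard to type
        return stack
    pops, pushes = TYPED[opcode]
    for wanted in pops:
        if not stack:
            raise VerificationFault("stack underflow")
        found = stack.pop()
        if found != wanted:
            raise VerificationFault(f"{opcode} needs {wanted}, found {found}")
    stack.extend(pushes)
    return stack
```

For example, `apply_opcode("fadd", ["I", "I"])` raises `VerificationFault` because the float add opcode is applied to integers, while `apply_opcode("dup", ["F"])` succeeds for any type and yields `["F", "F"]`.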

The second computer node 104, assumed here to be
configured as a file or other information server, includes a
central processing unit 150, a user interface 156, memory
154, and a communication interface 158 that connects
the second computer node to the computer communication
network 120. Memory 154 stores programs 103, 164, 166
for execution by the processor 150 and/or distribution to
other computer nodes.

The first and second computer nodes 102 and 104 may 
utilize different computer platforms and operating systems 
112, 160 such that object code programs executed on either 
one of the two computer nodes cannot be executed on the 
other. For instance, the server node 104 might be a Sun 
Microsystems computer using a Unix operating system 
while the user workstation node 102 may be an IBM 
compatible computer using an 80486 microprocessor and a 
Microsoft DOS operating system. Furthermore, other user 
workstations coupled to the same network and utilizing the 
same server 104 might use a variety of different computer 
platforms and a variety of operating systems. 

In the past, a server 104 used for distributing software on
a network having computers of many types would store
distinct libraries of software for each of the distinct computer
platform types (e.g., Unix, Windows, DOS,
Macintosh, etc.). Thus, different versions of the same computer
program might be stored in each of the libraries.
However, using the present invention, many computer programs
could be distributed by such a server using just a
single, bytecode version of the program.

The bytecode verifier 120 is an executable program which
verifies operand data type compatibility and proper stack
manipulations in a specified bytecode (source) program 214
prior to the execution of the bytecode program by the
processor 106 under the control of the bytecode interpreter
122. Each bytecode program has an associated verification
status value that is True if the program's integrity is verified
by the bytecode verifier 120, and is otherwise set to False.

During normal execution of programs using languages
other than the Java bytecode language, the interpreter must
continually monitor the operand stack for overflows (i.e.,
adding more data to the stack than the stack can store) and
underflows (i.e., attempting to pop data off the stack when
the stack is empty). Such stack monitoring must normally be
performed for all instructions that change the stack's status
(which includes almost all instructions). For many programs,
stack monitoring instructions executed by the interpreter
account for approximately 80% of the execution time of an
interpreted computer program.

For many purposes, particularly the integrity of downloaded
computer programs, the Internet is a "hostile
environment." A downloaded bytecode program may contain
errors involving the data types of operands not matching the
data type restrictions of the instructions using those
operands, which may cause the program to fail during
execution. Even worse, a bytecode program might attempt to
create object references (e.g., by loading a computed number
into the operand stack and then attempting to use the
computed number as an object handle) and to thereby breach
the security and/or integrity of the user's computer.

Use of the bytecode verifier 120 in accordance with the
present invention enables verification of a bytecode program's
integrity and allows the use of an interpreter 122
which does not execute the usual stack monitoring instructions
during program execution, thereby greatly accelerating
the program interpretation process.
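
To make the difference concrete, the following Python sketch contrasts an interpreter that monitors the operand stack on every instruction with one that relies on prior verification. The two-opcode language (`push`, `add`) and the function names `run_checked` and `run_unchecked` are assumptions for illustration only, and the sketch says nothing about the actual speedup.

```python
def run_checked(code, max_stack):
    """Interpreter that performs stack and type checks at run time."""
    stack = []
    for op, *args in code:
        if op == "push":
            if len(stack) >= max_stack:          # overflow monitoring
                raise RuntimeError("stack overflow")
            stack.append(args[0])
        elif op == "add":
            if len(stack) < 2:                   # underflow monitoring
                raise RuntimeError("stack underflow")
            b, a = stack.pop(), stack.pop()
            if not (isinstance(a, int) and isinstance(b, int)):
                raise RuntimeError("wrong operand type")
            stack.append(a + b)
    return stack

def run_unchecked(code):
    """Interpreter for code a verifier has already accepted: no checks."""
    stack = []
    for op, *args in code:
        if op == "push":
            stack.append(args[0])
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
    return stack
```

Both functions compute the same result for a well-formed program; the second simply omits the per-instruction tests, which is only safe when a verifier has ruled out the faults those tests would catch.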

The Bytecode Program Verifier

Referring now to FIG. 3, the bytecode program verifier
120 (often called the "verifier") uses a few temporary data
structures to store information it needs while verifying a
specified bytecode program 300. In particular, the verifier
120 uses a set of data structures 142 for representing current
stack and register status information, and a snapshot data
structure 140 for representing the status of the virtual stack
and registers just prior to the execution of each instruction
in the program being verified. The current status data
structures 142 include: a stack size indicator, herein called
the stack counter 301, a virtual stack 302 that indicates the
data types of all items in the virtual operand stack, a virtual
register array 304 that indicates the data types of all items in
the virtual registers, and a "jsr" bit vector array 306 that
stores zero or more bit vectors associated with the zero or
more subroutine calls required to reach the instruction
currently being processed.

The stack counter 301, which indicates the number of
stack elements that are currently in use (i.e., at the point in
the method associated with the instruction currently being
analyzed), is updated by the verifier 120 as it keeps track of
the virtual stack manipulations so as to reflect the current
number of virtual stack entries.
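
One possible in-memory layout for the current status data structures is sketched below in Python. The class name `CurrentStatus`, its field names and the `push`, `pop` and `snapshot` helpers are hypothetical; the text only specifies which pieces of information are kept, not how they are represented.

```python
import copy
from dataclasses import dataclass, field

@dataclass
class CurrentStatus:
    """Stand-in for the current status data structures 142."""
    stack_counter: int = 0                                  # counter 301
    virtual_stack: list = field(default_factory=list)       # stack 302
    virtual_registers: list = field(default_factory=list)   # array 304
    jsr_bit_vectors: list = field(default_factory=list)     # array 306

    def push(self, marker):
        """Record a pushed data type and keep the counter in step."""
        self.virtual_stack.append(marker)
        self.stack_counter += 1

    def pop(self):
        """Remove the top data type and keep the counter in step."""
        marker = self.virtual_stack.pop()
        self.stack_counter -= 1
        return marker

    def snapshot(self):
        """Return an independent copy of the status at this point."""
        return copy.deepcopy(self)
```

A copy taken with `snapshot()` is unaffected by later pushes and pops, which is the property needed to record the status just prior to each instruction.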

The virtual stack 302 stores data type information regarding
each datum that will be stored by the bytecode program
300 in the virtual operand stack during actual execution of
the program. In the preferred embodiment, the virtual stack
302 is used in the same way as a regular stack, except that
instead of storing actual data and constants, the virtual stack
302 stores a data type indicator value for each datum that
will be stored in the operand stack during actual execution
of the program. Thus, for instance, if during actual execution
the stack were to store three values:

    HandleToObjectA
    5
    1

the corresponding virtual stack entries will be

    R;Class A;initialized
    I
    I

where "R" in the virtual stack indicates an object reference,
"Class A" indicates that the class or type of the referenced
object is "A", "initialized" indicates that the referenced object
is an initialized object, and each "I" in the virtual stack indicates
an integer. Furthermore, the stack counter 301 in this
example would store a value of 3, corresponding to three
values being stored in the virtual stack 302.
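
The three-value example above can be reproduced with a short Python sketch. The `Marker` and `Handle` classes and the `marker_for` function are hypothetical names, and only integers and object handles are modeled.

```python
from typing import NamedTuple, Optional

class Marker(NamedTuple):
    """Data type indicator stored in the virtual stack."""
    kind: str                          # "R" = object reference, "I" = integer
    cls: Optional[str] = None          # object class, for references only
    initialized: Optional[bool] = None

    def __str__(self):
        if self.kind == "R":
            state = "initialized" if self.initialized else "uninitialized"
            return f"R;Class {self.cls};{state}"
        return self.kind

class Handle:
    """Stand-in for a run-time object handle of a given class."""
    def __init__(self, cls, initialized=True):
        self.cls = cls
        self.initialized = initialized

def marker_for(value):
    """Map a run-time operand to its virtual stack marker."""
    if isinstance(value, Handle):
        return Marker("R", value.cls, value.initialized)
    if isinstance(value, int):
        return Marker("I")
    raise ValueError("data type not modeled in this sketch")

# The operand stack from the example and its virtual counterpart.
operand_stack = [Handle("A"), 5, 1]
virtual_stack = [marker_for(v) for v in operand_stack]
stack_counter = len(virtual_stack)
```

Printing the markers with `str` gives `R;Class A;initialized`, `I`, `I`, and `stack_counter` is 3, matching the example.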
Data of each possible data type is assigned a corresponding
virtual stack marker value, for instance: integer (I), long
integer (L), single precision floating point number (F),
double precision floating point number (D), byte (B), short
(S), and object reference (R). The marker value for an object
reference includes a value (e.g., "Class A") indicating the
object type and a flag indicating if the object has been
initialized. If this is an object that has been created by the
current method, but has not yet been initialized, the marker
value for the object reference also indicates the program
location of the instruction that created the object instance
being referenced.

The virtual register array 304 serves the same basic
function as the virtual stack 302. That is, it is used to store
data type information for registers used by the specified
bytecode program. Since data is often transferred by programs
between registers and the operand stack, the bytecode
instructions performing such data transfers and otherwise
using registers can be checked to ensure that the data values
in the registers accessed by each bytecode instruction are
consistent with the data type usage restrictions on those
bytecode instructions.

The structure and use of the jsr bit vector array 306 will
be described below in the discussion of the handling of
subroutine jumps and returns.

While processing the specified bytecode program, for
each datum that would be popped off the stack for processing
by a bytecode instruction, the verifier pops the same
number of data type values off the virtual stack 302 and
compares the data type values with the data type requirements
of the bytecode. For each datum that would be pushed
onto the stack by a bytecode instruction, the verifier pushes
onto the virtual stack a corresponding data type value.

One aspect of program verification in accordance with
the present invention is verification that the number of
operands in the virtual stack 302 is identical every time a
particular instruction is executed, and that the data types of
operands in the virtual stack are compatible. If a particular
bytecode instruction can be immediately preceded in execution
by two or more different instructions, then the status of
the virtual stack immediately after processing of each of
those different predecessor instructions must be compared.
Usually, at least one of the different preceding instructions
will be a conditional or unconditional jump or branch
instruction. A corollary of the above "stack consistency"
requirement is that each program loop must not result in a
net addition or reduction in the number of operands stored
in the operand stack.

The stack snapshot array 140 is used to store "snapshots"
of the stack counter 301, virtual stack 302, virtual register
array 304 and jsr bit vector array 306. A separate snapshot
310 is stored for every instruction in the bytecode program.
Each stored stack snapshot includes a "changed" flag 320, a
stack counter 321, a stack status array 322, a register status
array 324 and a variable length jsr bit vector array 326. The
jsr bit vector array 326 is empty except for instructions that
can only be reached via one or more jsr instructions.

The changed flag 320 is used to determine which instructions
require further processing by the verifier, as will be
explained below. The stack counter 321, stack status array
322, register status array 324, and jsr bit vector array 326 are
based on the values stored in the data structures 301, 302,
304 and 306 at corresponding points in the verification
process.

The snapshot storage structure 140 furthermore stores
instruction addresses 328 (e.g., the absolute or relative
address of each target instruction). Instruction addresses 328
are used by the verifier to make sure that no jump or branch
instruction has a target that falls in the middle of a bytecode
instruction.

As was described previously, the bytecode program 300
includes a plurality of data type specific instructions, each of
which is evaluated by the verifier 120 of the present invention.

Referring now to FIGS. 4A-4F and Table 2, the execution
of the bytecode verifier program 120 will be described in
detail. Table 2 lists a pseudocode representation of the
verifier program. The pseudocode used in Table 2 utilizes
universal computer language conventions. While the
pseudocode employed here has been invented solely for the
purposes of this description, it is designed to be easily
understandable by any computer programmer skilled in the
art.

As shown in FIG. 4A, when a class having one or more
bytecode methods is loaded, the bytecode verifier performs a
number of non-bytecode based tests (352) on the loaded
class, including verifying:

  the format of the class;

  that the class is not a subclass of a "final" class;

  that no method in the class overrides a "final" method in
  a superclass;

  that the class has a superclass; and

  that each class reference, field reference and method
  reference in the constant pool has a legal name, class
  and type signature.

If any of these initial verification tests fail, an appropriate
error message is displayed or printed, and the verification
procedure exits with an abort return code (354).

Next, the verification procedure checks to see if all
bytecode methods have been verified (356). If so, the
procedure exits with a success return code. Otherwise, it
selects a next bytecode method in the loaded object class
that requires verification.

The code for each method includes the following information:

  the maximum stack space needed by the method;

  the maximum number of registers used by the method;

  the actual bytecodes for executing the method;

  a table of exception handlers.

Each entry in the exception handlers table gives a start
and end offset into the bytecodes, an exception type, and the
offset of a handler for the exception. The entry indicates that
if an exception of the indicated type occurs within the code
indicated by the starting and ending offsets, a handler for the
exception will be found at the given handler offset.

After selecting a method to verify, the verifier initializes
a number of data structures (362), including the stack
counter 301, virtual stack 302, virtual register array 304, jsr
bit vector array 306, and the snapshot array 140. The
snapshot array is initialized as follows. The snapshot for the
first instruction of the method is initialized to indicate that
the stack is empty and the registers are empty except for data
types indicated by the method's type signature, which
indicates the initial contents of the registers. The snapshots for
all other instructions are initialized to indicate that the
instruction has not yet been visited.

In addition, the "changed" bit for the first instruction of
the program is set, and a flag called VerificationSuccess is set
to True (364). If the VerificationSuccess flag is still set to
True when the verification procedure is finished (368), that
indicates that the integrity of the method has been verified.
If the VerificationSuccess flag is set to False when the
verification procedure is finished, the method's integrity has
not been verified, and therefore an error message is displayed
or printed, and the verification procedure exits with
an abort return code (354).

After these initial steps, a data flow analysis is performed
on the selected method (366). The details of the data flow
analysis, which forms the main part of the verification
procedure, are discussed below with reference to FIG. 4B.

In summary, the verification procedure processes each
method of the loaded class file until either all the bytecode
methods are successfully verified, or the verification of any
one of the methods fails.

Data Flow Analysis of Method

Referring to FIG. 4B and the corresponding portion of
Table 2, the data flow analysis of the selected method is
completed (382) when there are no instructions whose
changed bit is set (380). Detection of any stack or register
usage error during the analysis causes the VerificationSuccess
flag to be set to False and the analysis to be stopped
(382).

If there is at least one instruction whose changed bit is set
(380), the procedure selects a next instruction whose
changed bit is set (384). Any instruction whose changed bit
is set can be selected.

The analysis of the selected instruction begins with the
pre-existing snapshot for the selected instruction being
loaded into the stack counter, virtual stack, virtual
register array, and jsr bit vector array, respectively (386). In
addition, the changed bit for the selected instruction is
turned off (386).

Next, the effect of the selected instruction on the stack and
registers is emulated (388). More particularly, four types of
"actions" performed by bytecode instructions are emulated
and checked for integrity: stack pops, stack pushes, reading
data from registers and writing data to registers. The
steps of this emulation process are described next with
reference to FIGS. 4C-4G.

Referring to FIG. 4C, if the currently selected instruction
pops data from the stack (450), the stack counter 301 is
inspected (452) to determine whether there is sufficient data
in the stack to satisfy the data pop requirements of the
instruction. If the operand stack has insufficient data (452)
for the current instruction, that is called a stack underflow,
in which case an error signal or message is generated (454)
identifying the place in the program that the stack underflow
was detected. In addition, the verifier will then set a
VerificationSuccess flag to False and abort (456) the
verification process. If no

which case an error signal or message is generated (474)
identifying the place in the program that the stack underflow
was detected. In addition, the verifier will then set the
VerificationSuccess flag to False and abort (476) the
verification process.

If no stack overflow condition is detected, the verifier will
add (478) an entry to the virtual stack indicating the type of
data (operand) which is to be pushed onto the operand stack
(during the actual execution of the program) for each datum
to be pushed onto the stack by the currently selected
instruction. This information is derived from the data type
specific opcodes utilized in the bytecode program of the
preferred embodiment of the present invention, the prior
contents of the stack and the prior contents of the registers.
The verifier also updates the stack counter 301 to reflect the
added entry or entries in the virtual stack 302. This
completes the stack push verification process.

Referring to FIG. 4E, if the currently selected instruction
reads data from a register (510), the verifier will compare
(512) the data type code information previously stored in the
corresponding virtual register with the data type requirements
(if any) of the currently selected instruction. For
object handles, data type checking takes into account object
inheritance (i.e., a method that operates on an object of
a specified class can also operate on an object of any
subclass of the specified class). If a mismatch is detected
(512) between the data type information stored in the virtual
register and the data type requirements of the currently
selected instruction, then a message is generated (514)
identifying the place in the bytecode program where the
mismatch occurred. The verifier will then set the
VerificationSuccess flag to False and abort (516) the
verification process.

The verifier also checks to see if the register accessed by
the currently selected instruction has a register number
higher than the maximum register number for the method
being verified (518). If so, a message is generated (514)
identifying the place in the bytecode program where the

stack underflow condition is detected, the verifier will com- „ acc ff error occurred The verifier will then set toe 

pare (458) the data type code information previously stored VenficauonSuccess flag to False and abort (516) the venfi- 

in the virtual stack with the data type requirements (if any) cation process. 

of the currently selected instruction, For example, if the # the currently selected instruction does not read data 

opcode of the instruction being analyzed calls for an integer from a register (510) or the data type comparison at step 512 

add of a value popped from the stack, the verifier will 45 results in a match and the registered accessed is within the 

compare the operand information of the item in the virtual range of register numbers used by the method being verified 

stack which is being popped to make sure that is of the (518). then the verifier continues processing the currently 

proper data type, namely integer. If the comparison results in selected instruction at step 520. 

a match, then the verifier deletes (460) the information from Referring to FIG. 4F, if the currently selected instruction 

the virtual stack associated with the entry being popped and stores data into a register (520), then the data type associated 

updates the stack counter 301 to reflect the number of entries with the selected bytecode instruction is stored in the virtual 

popped from the virtual stack 302. register (522). 

If a mismatch is detected (458) between the stored oper- The verifier also checks to see if the registers) to be 

and information in the popped entry of the virtual stack 302 written by the currently selected instruction has(have) a 

and the data type requirements of the currently selected 55 register number higher than the maximum register number 

instruction, then a message is generated (462) identifying for the method being verified (523). If so, an error message 

the place in the bytecode program where the mismatch is generated (526) identifying the place in the bytecode 

occurred. The verifier will then set the VerificationSuccess program where the register access error occurred The 

flag to False and abort (456) the verification process. This verifier will then set the VerificationSuccess flag to False and 

completes the stack pop verification process. 60 abort (528) the verification process. 

Referring to FIG. 4D, if the currently selected instruction In addition, the instruction emulation procedure updates 

pushes data onto the stack (470), the stack counter is the jsr bit vector array 306 as follows. The jsr bit vector array 

inspected (472) to determine whether there is sufficient room 306 includes a separate bit vector for each subroutine level, 

in the stack to store the data the selected instruction will Thus, if the current instruction is in a subroutine nested four 

push onto the stack. If the operand stack has insufficient 65 levels deep, there will be four active jsr bit vectors in the 

room to store the data to be pushed onto the stack by the array 306. If the current instruction is in a subroutine that is 

current instruction (472), that is called a stack overflow, in the target of a jsr instruction (i.e., a jump to subroutine 
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instruction), for each subroutine level applicable to the 
current instruction, the corresponding jsr bit vector is 
updated to indicate the registers) accessed or modified by 
the current instruction (540, FIG. 4G). The set of "marked" 
registers in each jsr bit vector can only be increased, not 
decreased, by step 540, 

At this point the procedure for emulating one instruction 
is complete. 
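
The stack pop and stack push checks described above can be illustrated with a minimal sketch. It is not the verifier of the preferred embodiment: the class name VirtualStackSketch, the Type enumeration, the VerificationFailure exception and the emulateIadd method are hypothetical names introduced only for this illustration, and an exception stands in for setting the VerificationSuccess flag to False and aborting.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch of the virtual stack checks; all names are illustrative. */
final class VirtualStackSketch {
    enum Type { INT, LONG, FLOAT, DOUBLE, OBJECT }

    /** Stands in for "set VerificationSuccess to False and abort". */
    static final class VerificationFailure extends RuntimeException {
        VerificationFailure(String message) { super(message); }
    }

    private final Deque<Type> virtualStack = new ArrayDeque<>();
    private final int maxDepth;

    VirtualStackSketch(int maxDepth) { this.maxDepth = maxDepth; }

    /** Plays the role of the stack counter: number of virtual stack entries. */
    int stackCounter() { return virtualStack.size(); }

    /** Pop one entry; fail on underflow or on a data type mismatch. */
    void pop(Type required, int pc) {
        if (virtualStack.isEmpty()) {
            throw new VerificationFailure("stack underflow at instruction " + pc);
        }
        Type actual = virtualStack.pop();
        if (actual != required) {
            throw new VerificationFailure("type mismatch at instruction " + pc
                    + ": expected " + required + ", found " + actual);
        }
    }

    /** Push one entry; fail on overflow. */
    void push(Type pushed, int pc) {
        if (virtualStack.size() >= maxDepth) {
            throw new VerificationFailure("stack overflow at instruction " + pc);
        }
        virtualStack.push(pushed);
    }

    /** Emulate an integer add: pop two integers, push one integer. */
    void emulateIadd(int pc) {
        pop(Type.INT, pc);
        pop(Type.INT, pc);
        push(Type.INT, pc);
    }

    public static void main(String[] args) {
        VirtualStackSketch stack = new VirtualStackSketch(2);
        stack.push(Type.INT, 0);
        stack.push(Type.INT, 1);
        stack.emulateIadd(2);
        System.out.println("entries after iadd: " + stack.stackCounter());
    }
}
```

Only data types are tracked, never operand values, which is why the check can be done once before execution.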

Referring back now to FIG. 4B, if the instruction emulation
resulted in the detection of an error, the verification
process is halted (382). Otherwise, the next step (390) is to
determine the selected instruction's set of successor
instructions. The "successor instructions" are defined to be all
instructions that might be executed next after the currently
selected instruction. The set of all successor instructions
includes:

(A) the next instruction in the method, if the current 
instruction is not an unconditional goto, a return, or a 
throw; 

(B) the target of a conditional or unconditional branch; 

(C) all exception handlers for this instruction; and 

(D) when the current instruction is a subroutine return 
instruction, the instructions immediately following all 
jsr's that target the called subroutine. 

It is noted that the last instruction of most exception
handlers is a "goto" instruction. More generally, the successor
instruction for the end of an exception handler is simply
the successor instruction for the last instruction of the
exception handler.

As part of the successor instruction determination
process, the verifier also checks to see if the program can
simply "fall off" the current instruction (i.e., without having
a legal next instruction). If so, this is a fatal error and the
VerificationSuccess flag is set to False and the verification
procedure is terminated (382).
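
Rules (A) through (D) and the "fall off" check can be sketched as follows. This is a simplified illustration under stated assumptions: the Insn record, the Kind enumeration and the successors method are hypothetical, the jsr instruction itself is not modeled, and a subroutine return ("ret") is assumed not to fall through to the next instruction.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch of successor determination; names are illustrative. */
final class SuccessorSketch {
    enum Kind { NORMAL, CONDITIONAL_BRANCH, GOTO, RETURN, THROW, RET }

    /** A simplified instruction: its kind, branch target and covering handlers. */
    static final class Insn {
        final Kind kind;
        final int branchTarget;          // -1 when the instruction has no target
        final List<Integer> handlers;    // entry points of covering exception handlers

        Insn(Kind kind, int branchTarget, List<Integer> handlers) {
            this.kind = kind;
            this.branchTarget = branchTarget;
            this.handlers = handlers;
        }
    }

    /** jsrReturnPoints: instructions following every jsr that targets the current subroutine. */
    static List<Integer> successors(Insn[] code, int pc, List<Integer> jsrReturnPoints) {
        Insn insn = code[pc];
        List<Integer> result = new ArrayList<>();
        // (A) the next instruction, unless control cannot fall through
        if (insn.kind == Kind.NORMAL || insn.kind == Kind.CONDITIONAL_BRANCH) {
            if (pc + 1 >= code.length) {
                throw new IllegalStateException("program falls off the end at " + pc);
            }
            result.add(pc + 1);
        }
        // (B) the target of a conditional or unconditional branch
        if (insn.kind == Kind.CONDITIONAL_BRANCH || insn.kind == Kind.GOTO) {
            result.add(insn.branchTarget);
        }
        // (C) all exception handlers for this instruction
        result.addAll(insn.handlers);
        // (D) for a subroutine return, the instructions after the matching jsr's
        if (insn.kind == Kind.RET) {
            result.addAll(jsrReturnPoints);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> none = Collections.<Integer>emptyList();
        Insn[] code = {
            new Insn(Kind.CONDITIONAL_BRANCH, 2, none),
            new Insn(Kind.NORMAL, -1, none),
            new Insn(Kind.RETURN, -1, none)
        };
        System.out.println(successors(code, 0, none));
    }
}
```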

Snapshot Merger 

After the successor instruction determination step (390),
the verifier next merges the current stack counter 301, virtual
stack 302, virtual register array 304 and jsr bit vector array
306 into the SnapShots of each of the successor instructions
(392). This merger is performed separately for each successor
instruction. There are a number of "special" cases
requiring special handling of the status and snapshot merger
process.

For instance, if a successor instruction is an exception
handler, the Stack Status portion of the SnapShot of the
successor instruction is defined to contain a single object of
the exception type indicated by the exception handler
information (i.e., the stored data type for the first virtual stack
element indicates the object type of the exception handler),
and furthermore the stack counter of the SnapShot of the
successor instruction is set to a value of 1.

If the SnapShot for a successor instruction indicates that
it has never before been "visited" (i.e., it is empty), the stack
counter 301, virtual stack 302, virtual register array 304 and
jsr bit vector array 306 are copied into the SnapShot for the
successor instruction.

Otherwise, when the instruction has been visited before,
the snapshot merger is handled as follows. If the stack
counter in the Status Array does not match the stack counter
in the existing SnapShot or the two stacks are not identical
with regard to data types, except for differently typed object
handles, the VerificationSuccess flag is set to False and the
verification process is aborted. Otherwise, the virtual stack
302 and virtual register array 304 values are merged into the
values of the successor instruction's existing SnapShot as
follows.

If two corresponding virtual stack elements or two
corresponding virtual register elements contain different object
handles, the specified data type for the stack or register
element in the snapshot is replaced with the closest common
ancestor (i.e., superclass) of the two handle types. If two
corresponding virtual register elements contain different
data types (other than handles), the data type for the register
element in the updated SnapShot is denoted as "unknown"
(i.e., unusable). If two corresponding stack elements contain
different data types (other than handles), that is flagged by
the verifier as an error.
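
The three merge rules above can be sketched as follows. This is an illustration only: it assumes object handle types are represented by java.lang.Class objects and all other data types by plain strings such as "int", and the names MergeSketch, mergeRegister, mergeStackEntry and UNKNOWN are hypothetical. Interfaces are ignored when searching for the common ancestor.

```java
/** Hypothetical sketch of per-element snapshot merging; names are illustrative. */
final class MergeSketch {
    /** Marker for a register whose contents are unusable after a merge. */
    static final Object UNKNOWN = "unknown";

    /** Closest common superclass of two classes (interfaces are not considered). */
    static Class<?> closestCommonAncestor(Class<?> a, Class<?> b) {
        Class<?> candidate = a;
        while (candidate != null && !candidate.isAssignableFrom(b)) {
            candidate = candidate.getSuperclass();
        }
        return candidate == null ? Object.class : candidate;
    }

    /** Registers: differing handles widen to a common ancestor, anything else becomes unknown. */
    static Object mergeRegister(Object existing, Object incoming) {
        if (existing.equals(incoming)) {
            return existing;
        }
        if (existing instanceof Class && incoming instanceof Class) {
            return closestCommonAncestor((Class<?>) existing, (Class<?>) incoming);
        }
        return UNKNOWN;
    }

    /** Stack entries: differing handles widen, any other difference is a verification error. */
    static Object mergeStackEntry(Object existing, Object incoming) {
        if (existing.equals(incoming)) {
            return existing;
        }
        if (existing instanceof Class && incoming instanceof Class) {
            return closestCommonAncestor((Class<?>) existing, (Class<?>) incoming);
        }
        throw new IllegalStateException(
                "incompatible stack types: " + existing + " vs " + incoming);
    }

    public static void main(String[] args) {
        System.out.println(mergeRegister(Integer.class, Double.class));
        System.out.println(mergeRegister("int", "float"));
    }
}
```

The asymmetry is deliberate in the text: a register with conflicting contents can simply be marked unusable, whereas a stack entry cannot, so a conflict on the stack aborts verification.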
However, if the successor instruction is the instruction
immediately after a "jsr" instruction and the current
instruction is a "ret" instruction, the above rules for merging virtual
register status information are replaced with the following
rule:

1) for any register that the corresponding jsr bit vector
(i.e., the jsr bit vector for the current instruction that
corresponds to the successor jsr instruction) indicates
that the subroutine has accessed or modified, update the
successor instruction's virtual register SnapShot data to
use the data type of the virtual register at the time of the
return (i.e., use the virtual register data type information
in the corresponding element of the virtual register
array 304);

2) for all other registers, update the successor instruction's
virtual register SnapShot data to use the data type of the
register at the time of the preceding jsr instruction (i.e.,
copy the virtual register data type information from the
corresponding element of the virtual register array in
the preceding jsr instruction's SnapShot).
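
The two-part rule for a "ret" successor can be sketched with a java.util.BitSet standing in for one jsr bit vector. The class and method names are hypothetical, and data types are represented as plain strings purely for illustration.

```java
import java.util.Arrays;
import java.util.BitSet;

/** Hypothetical sketch of the register merge applied on return from a subroutine. */
final class RetMergeSketch {
    /**
     * For each register: if the subroutine touched it, take the type at the time
     * of the ret; otherwise take the type recorded at the preceding jsr.
     */
    static String[] registersAfterRet(String[] atJsr, String[] atRet,
                                      BitSet touchedBySubroutine) {
        String[] result = new String[atJsr.length];
        for (int r = 0; r < result.length; r++) {
            result[r] = touchedBySubroutine.get(r) ? atRet[r] : atJsr[r];
        }
        return result;
    }

    public static void main(String[] args) {
        BitSet touched = new BitSet();
        touched.set(2);   // the subroutine used register 2 only
        String[] merged = registersAfterRet(
                new String[] {"int", "float", "Object"},
                new String[] {"int", "unknown", "returnAddress"},
                touched);
        System.out.println(Arrays.toString(merged));
    }
}
```

Registers the subroutine never touched keep the types the caller had, so the same cleanup subroutine can be called from sites whose register contents differ.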

The snapshot merger procedure also copies the current jsr
bit vectors 306 into the SnapShot of the successor
instructions only to the extent that those successor instructions are
inside the same subroutines as the current instruction.

Finally, after the merger of the current verification status
information into the SnapShot of each successor instruction
has been performed, the changed bit for the successor
instruction is set only if the merging of the virtual stack and
register verification status values caused any change to the
successor instruction's SnapShot (394).

Thus, the analysis of each selected instruction can cause
the changed bit of one or more other instructions to be set.
The data flow analysis continues until there are no
instructions whose changed bit is set (380). Due to the fact that
some instructions are the successor instructions for multiple
other instructions, many instructions may be analyzed two
or more times by the data flow analysis procedure before the
data flow analysis of the method is completed.
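
The overall loop (select an instruction whose changed bit is set, emulate it, merge into each successor, set changed bits) can be sketched in a heavily reduced form. In this illustration each SnapShot holds only a stack depth, not data types or register contents, and every name (DataFlowLoopSketch, verify, stackDelta, successors) is hypothetical. The method is assumed to contain at least one instruction.

```java
/** Hypothetical, depth-only sketch of the changed-bit data flow loop. */
final class DataFlowLoopSketch {
    /**
     * stackDelta[i]: net change in stack depth caused by instruction i.
     * successors[i]: indices of the instructions that may execute after i.
     * Returns true when verification succeeds.
     */
    static boolean verify(int[] stackDelta, int[][] successors, int maxStack) {
        int n = stackDelta.length;
        Integer[] snapshot = new Integer[n];   // null means "never visited"
        boolean[] changed = new boolean[n];
        snapshot[0] = 0;                       // the stack is empty on entry
        changed[0] = true;
        while (true) {
            int pc = -1;
            for (int i = 0; i < n; i++) {      // select an instruction whose changed bit is set
                if (changed[i]) { pc = i; break; }
            }
            if (pc < 0) {
                return true;                   // no changed bits left: analysis complete
            }
            changed[pc] = false;
            int depth = snapshot[pc] + stackDelta[pc];   // emulate the instruction
            if (depth < 0 || depth > maxStack) {
                return false;                  // underflow or overflow
            }
            for (int succ : successors[pc]) {  // merge into each successor
                if (snapshot[succ] == null) {
                    snapshot[succ] = depth;    // first visit: copy and mark changed
                    changed[succ] = true;
                } else if (snapshot[succ] != depth) {
                    return false;              // stack counters disagree
                }
            }
        }
    }

    public static void main(String[] args) {
        // push; pop; goto 0 (a balanced loop)
        System.out.println(verify(new int[] {1, -1, 0}, new int[][] {{1}, {2}, {0}}, 1));
    }
}
```

Because a successor's changed bit is set only when its SnapShot actually changes, the loop terminates once every SnapShot has stabilized.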

Verification Considerations For Exception Handlers

An exception handler is a routine that protects a specified
set of program code, called a protected code block. The
exception handler is executed whenever the applicable
exception gets thrown during execution of the corresponding
protected code.

As indicated above, the Stack Status portion of the
SnapShot for the first instruction of the exception handler
contains a single object of the exception type indicated by
the exception handler information (i.e., the stored data type
for the first virtual stack element indicates the object type of
the exception handler), and furthermore the stack counter of
the SnapShot of the instruction is set to a value of 1.



05/05/2004, EAST Version: 1.4.1 



5,7- 

13 

The virtual register information of the SnapShot for the
exception handler's first instruction contains data type
values only for registers whose use is consistent throughout the
protected code, and contains "unknown" indicators for all
other registers used by the protected code.

Verification Considerations for "Finally" Code 
Blocks 

The following program:

    try {
        startFaucet();
        waterLawn();
    } finally {
        stopFaucet();
    }



ensures that the faucet is turned off, even if an exception
occurs while starting the faucet or watering the lawn. The
code inside the brackets after the word "try" is called the
protected code. The code inside the brackets after the word
"finally" is called the cleanup code. The cleanup code is
guaranteed to be executed, even if the protected code does
a "return" out of the function, or contains a break or continue
to code outside the try/finally code, or experiences an
exception.
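
The guarantee described above can be observed directly in Java source. The following small demonstration is not part of the verifier; the class FinallyDemo, its log list and its two methods are hypothetical names chosen to echo the faucet example.

```java
import java.util.ArrayList;
import java.util.List;

/** Demonstrates that cleanup code runs on both a return and an exception. */
final class FinallyDemo {
    static final List<String> log = new ArrayList<>();

    /** The protected code returns; the cleanup code still runs. */
    static int waterWithReturn() {
        try {
            log.add("start");
            return 1;
        } finally {
            log.add("stop");
        }
    }

    /** The protected code throws; the cleanup code still runs. */
    static void waterWithException() {
        try {
            log.add("start");
            throw new IllegalStateException("hose burst");
        } finally {
            log.add("stop");
        }
    }

    public static void main(String[] args) {
        waterWithReturn();
        try {
            waterWithException();
        } catch (IllegalStateException expected) {
            // the exception propagates only after the cleanup code has run
        }
        System.out.println(log);
    }
}
```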

In the Java bytecode language, the "finally" construct is
implemented using the exception handling facilities,
together with a "jsr" (jump to subroutine) instruction and
"ret" (return from subroutine) instruction. The cleanup code
is implemented as a subroutine. When it is called, the top
item on the stack will be the return address; this return
address is saved in a register. A "ret" is placed at the end of
the cleanup code to return to whatever code called the
cleanup.

To implement the "finally" feature, a special exception
handler is set up for the protected code which catches all
exceptions. This exception handler: (1) saves any exception
that occurs in a register, (2) executes a "jsr" to the cleanup
code, and (3) upon return from the cleanup code, re-throws
the exception.

If the protected code has a "return" instruction that when
executed will cause a jump to code outside the protected
code, the interpreter performs the following steps to execute
that instruction: (1) it saves the return value (if any) in a
register, (2) executes a "jsr" to the cleanup code, and (3)
upon return from the cleanup code, returns the value saved
in the register.

Breaks or continue instructions inside the protected code 
that go outside the protected code are compiled into byte- 
codes that include a "jsr" to the cleanup code before per- 
forming the associated "goto" function. In addition, there 
must be a "jsr" instruction at the end of the protected code. 

The jsr bit vector array and corresponding SnapShot data,
as discussed above, enable the successful verification of
bytecode programs that contain "finally" constructs. Due to
the provision of multiple jsr bit vectors, even multiply-nested
cleanup code can be verified.

Verification Considerations for New Object 
Formation and Initialization 

Creating a usable object in the bytecode interpreter is a
multi-step process. A typical bytecode sequence for creating
and initializing an object, and leaving it on top of the stack,
is:

    new <myClass>           /* allocate uninitialized space */
    dup                     /* duplicate object on the stack */
    <instructions for pushing arguments onto the stack>
    invoke myClass.<init>   /* initialize */



The myClass initialization method, myClass.<init>, sees
the newly initialized object as its argument in register 0. It
must either call an alternative myClass initialization method
or call the initialization method of a superclass of the object
before it is allowed to do anything else with the object.

To prevent the use of uninitialized objects, and to prevent
objects from being initialized more than once, the bytecode
verifier pushes a special data type on the stack as the result
of the opcode "new":

    R;ObjClass;uninitialized;creationstep

The instruction number (denoted above as "creationstep")
needs to be stored as part of the special data type since there
may be multiple instances of a not-yet initialized data type
in existence at one time. This special data type indicates the
instruction in which the object was created and the class type
of the uninitialized object created. When an initialization
method is called on that object, all occurrences of the special
type on the virtual stack and in the virtual registers (i.e., all
virtual stack and virtual registers that have the identical data
type, including the identical object creation instruction) are
replaced by the appropriate, initialized data type:

    R;ObjClass;initialized

During verification, the special data type for uninitialized
objects is an illegal data type for any bytecode instruction to
use, except for a call to an object initialization method for
the appropriate object class. Thus, the verifier ensures that an
uninitialized object cannot be used until it is initialized.

Similarly, the initialized object data type is an illegal data
type for a call to an object initialization method. In this way
the verifier ensures that an object is not initialized more than
once.
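
The bookkeeping for uninitialized objects can be sketched as follows. The sketch reuses the "R;ObjClass;uninitialized;creationstep" notation as plain strings; the class UninitTrackingSketch and its methods are hypothetical, argument pushing is omitted, and the receiver of the initialization call is assumed to be the top (last) entry of the virtual stack.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch of tracking uninitialized objects during verification. */
final class UninitTrackingSketch {
    /** Data type pushed by "new": records the class and the creating instruction. */
    static String uninitialized(String cls, int creationStep) {
        return "R;" + cls + ";uninitialized;" + creationStep;
    }

    static String initialized(String cls) {
        return "R;" + cls + ";initialized";
    }

    /** Emulates a call to the initialization method of cls on the top stack entry. */
    static void emulateInit(String cls, List<String> stack, List<String> registers) {
        if (stack.isEmpty()) {
            throw new IllegalStateException("stack underflow");
        }
        String receiver = stack.remove(stack.size() - 1);
        if (!receiver.startsWith("R;" + cls + ";uninitialized;")) {
            // covers both a wrong class and an already initialized object
            throw new IllegalStateException("illegal receiver for <init>: " + receiver);
        }
        // every entry with the identical type and creation step becomes initialized
        Collections.replaceAll(stack, receiver, initialized(cls));
        Collections.replaceAll(registers, receiver, initialized(cls));
    }

    public static void main(String[] args) {
        // state after "new myClass" at instruction 5 followed by "dup"
        List<String> stack = new ArrayList<>(Arrays.asList(
                uninitialized("myClass", 5), uninitialized("myClass", 5)));
        List<String> registers = new ArrayList<>(Arrays.asList(
                uninitialized("myClass", 9), "I"));
        emulateInit("myClass", stack, registers);
        System.out.println(stack + " " + registers);
    }
}
```

Including the creation step in the type is what keeps the object created at instruction 9 uninitialized when the one created at instruction 5 is initialized.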

One special check that the verifier must perform during
the data flow analysis is that for every backwards branch, the
verifier checks that there are no uninitialized objects on the
stack or in a register. See steps 530, 532, 534, 536 in FIG.
4F. In addition, there may not be any uninitialized objects in
a register in code protected by an exception handler or a
finally code block. See steps 524, 526, 528 in FIG. 4F.
Otherwise, a devious piece of code could fool the verifier
into thinking it had initialized an object when it had, in fact,
initialized an object created in a previous pass through the
loop. For example, an exception handler could be used to
indirectly perform a backwards branch.

Class Loader and Bytecode Interpreter 

Referring to the flow chart in FIG. 5 and Table 3, the
execution of the class loader 124 and bytecode interpreter
122 will be described. Table 3 lists a pseudocode
representation of the class loader and bytecode interpreter.

The class loader 124 is typically invoked when a user first
initiates execution of a procedure, requiring that an object of
the appropriate object class be generated. The class loader
124 loads in the appropriate object class file (560) and calls
the bytecode program verifier 120 to verify the integrity of
all the bytecode programs in the loaded object class (562).
If the verifier returns a "verification failure" value (564), the
attempt to execute the specified bytecode program is aborted
by the class loader (566).

If all the methods are successfully verified (564) an object
instance of the object class is generated, and the bytecode
interpreter 122 is invoked (570) to execute the user
requested procedure, which is typically called a method. The
bytecode interpreter of the present invention does not
perform (and does not need to perform) any operand stack
overflow and underflow checking during program execution
and also does not perform any data type checking for data
stored in the operand stack during program execution. These
conventional stack overflow, underflow and data type
checking operations can be skipped by the present invention
because the verifier has already verified that errors of these
types will not be encountered during program execution.
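
The division of labor between a one-time verification pass and an interpreter loop without explicit stack checks can be sketched with a toy instruction set. Everything here is hypothetical and far simpler than the Java bytecode language: three integer-only opcodes (PUSH, ADD, MUL), straight-line code only, and the names UncheckedInterpreterSketch, verify, run and loadAndRun. Note that the Java runtime hosting this sketch still bounds-checks array accesses; the point is only that the interpreter loop itself contains no overflow, underflow or type tests.

```java
/** Hypothetical toy interpreter: verify once, then execute without stack checks. */
final class UncheckedInterpreterSketch {
    static final int PUSH = 0;   // followed by one operand word
    static final int ADD = 1;
    static final int MUL = 2;

    /** One-time check of straight-line code against a stack of maxStack slots. */
    static boolean verify(int[] code, int maxStack) {
        int depth = 0;
        for (int pc = 0; pc < code.length; pc++) {
            switch (code[pc]) {
                case PUSH:
                    if (pc + 1 >= code.length) return false;   // missing operand
                    pc++;
                    depth++;
                    if (depth > maxStack) return false;        // overflow
                    break;
                case ADD:
                case MUL:
                    if (depth < 2) return false;               // underflow
                    depth--;
                    break;
                default:
                    return false;                              // unknown opcode
            }
        }
        return depth == 1;   // exactly one result must remain
    }

    /** Interpreter loop with no stack or type checks; only safe after verify(). */
    static int run(int[] code, int maxStack) {
        int[] stack = new int[maxStack];
        int sp = 0;
        for (int pc = 0; pc < code.length; pc++) {
            switch (code[pc]) {
                case PUSH: stack[sp++] = code[++pc]; break;
                case ADD:  sp--; stack[sp - 1] += stack[sp]; break;
                case MUL:  sp--; stack[sp - 1] *= stack[sp]; break;
            }
        }
        return stack[sp - 1];
    }

    /** Mirrors the class loader: refuse to execute code that fails verification. */
    static int loadAndRun(int[] code, int maxStack) {
        if (!verify(code, maxStack)) {
            throw new IllegalArgumentException("verification failure");
        }
        return run(code, maxStack);
    }

    public static void main(String[] args) {
        int[] code = {PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL};   // (2 + 3) * 4
        System.out.println(loadAndRun(code, 2));
    }
}
```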

The program interpreter of the present invention is
especially efficient for execution of bytecode programs having
instruction loops that are executed many times, because the
operand stack checking instructions are executed only once
for each bytecode in each such instruction loop in the present
invention. In contrast, during execution of a program by a
conventional interpreter, the interpreter must continually
monitor the operand stack for overflows (i.e., adding more
data to the stack than the stack can store) and underflows
(i.e., attempting to pop data off the stack when the stack is
empty). Such stack monitoring must normally be performed
for all instructions that change the stack's status (which
includes almost all instructions). For many programs, stack
monitoring instructions executed by the interpreter account
for approximately 80% of the execution time of an
interpreted computer program. As a result, the interpreter of the
present invention will often execute programs at two to ten
times the speed of a conventional program interpreter
running on the same computer.

Alternate Embodiments 

The foregoing descriptions of specific embodiments of the 
present invention have been presented for purposes of 
illustration and description. They are not intended to be 
exhaustive or to limit the invention to the precise forms 
disclosed, and obviously many modifications and variations 
are possible in light of the above teaching. The embodiments 
were chosen and described in order to best explain the 
principles of the invention and its practical application, to 
thereby enable others skilled in the art to best utilize the 
invention and various embodiments with various modifica- 
tions as are suited to the particular use contemplated. It is 
intended that the scope of the invention be defined by the 
claims appended hereto and their equivalents. 

TABLE 1

BYTECODES IN JAVA LANGUAGE

INSTRUCTION NAME    SHORT DESCRIPTION

nop                 no operation
aconst_null         push null object
iconst_m1           push integer constant -1
iconst_0            push integer constant 0
iconst_1            push integer constant 1
iconst_2            push integer constant 2
iconst_3            push integer constant 3
iconst_4            push integer constant 4
iconst_5            push integer constant 5
lconst_0            push long 0L
lconst_1            push long 1L
fconst_0            push float constant 0.0
fconst_1            push float constant 1.0
fconst_2            push float constant 2.0
dconst_0            push double float constant 0.0d
dconst_1            push double float constant 1.0d
bipush              push byte-sized value
sipush              push two-byte value
ldc                 load a constant from constant table (1 byte index)
ldc_w               load a constant from constant table (2 byte index)
ldc2_w              load a 2-word constant ..
iload               load local integer variable
lload               load local long variable
fload               load local floating variable
dload               load local double variable
aload               load local object variable
iload_0             load local integer variable #0
iload_1             load local integer variable #1
iload_2             load local integer variable #2
iload_3             load local integer variable #3
lload_0             load local long variable #0
lload_1             load local long variable #1
lload_2             load local long variable #2
lload_3             load local long variable #3
fload_0             load local float variable #0
fload_1             load local float variable #1
fload_2             load local float variable #2
fload_3             load local float variable #3
dload_0             load lcl double float variable #0
dload_1             load lcl double float variable #1
dload_2             load lcl double float variable #2
dload_3             load lcl double float variable #3
aload_0             load local object variable #0
aload_1             load local object variable #1
aload_2             load local object variable #2
aload_3             load local object variable #3
iaload              load from array of integer
laload              load from array of long
faload              load from array of float
daload              load from array of double
aaload              load from array of object
baload              load from array of (signed) bytes
caload              load from array of chars
saload              load from array of (signed) shorts
istore              store local integer variable
lstore              store local long variable
fstore              store local float variable
dstore              store local double variable
astore              store local object variable
istore_0            store local integer variable #0
istore_1            store local integer variable #1
istore_2            store local integer variable #2
istore_3            store local integer variable #3
lstore_0            store local long variable #0
lstore_1            store local long variable #1
lstore_2            store local long variable #2
lstore_3            store local long variable #3
fstore_0            store local float variable #0
fstore_1            store local float variable #1
fstore_2            store local float variable #2
fstore_3            store local float variable #3
dstore_0            store lcl double float variable #0
dstore_1            store lcl double float variable #1
dstore_2            store lcl double float variable #2
dstore_3            store lcl double float variable #3
astore_0            store local object variable #0
astore_1            store local object variable #1
astore_2            store local object variable #2
astore_3            store local object variable #3
iastore             store into array of int
lastore             store into array of long
fastore             store into array of float
dastore             store into array of double float
aastore             store into array of object
bastore             store into array of (signed) bytes
castore             store into array of chars
sastore             store into array of (signed) shorts
pop                 pop top element
pop2                pop top two elements
dup                 dup top element
dup_x1              dup top element. Skip one
dup_x2              dup top element. Skip two
dup2                dup top two elements.
dup2_x1             dup top 2 elements. Skip one
dup2_x2             dup top 2 elements. Skip two
swap                swap top two elements of stack.
iadd                integer add
ladd                long add
fadd                floating add
dadd                double float add
isub                integer subtract
lsub                long subtract
fsub                floating subtract
dsub                floating double subtract
imul                integer multiply
lmul                long multiply
fmul                floating multiply
dmul                double float multiply
idiv                integer divide
ldiv                long divide
fdiv                floating divide
ddiv                double float divide
irem                integer mod
lrem                long mod
frem                floating mod
drem                double float mod
ineg                integer negate
lneg                long negate
fneg                floating negate
dneg                double float negate
ishl                shift left
lshl                long shift left
ishr                shift right
lshr                long shift right
iushr               unsigned shift right
lushr               long unsigned shift right
iand                boolean and
land                long boolean and
ior                 boolean or
lor                 long boolean or
ixor                boolean xor
lxor                long boolean xor
iinc                increment lcl variable by constant
i2l                 integer to long
i2f                 integer to float
i2d                 integer to double
l2i                 long to integer
l2f                 long to float
l2d                 long to double
f2i                 float to integer
f2l                 float to long
f2d                 float to double
d2i                 double to integer
d2l                 double to long
d2f                 double to float
int2byte            integer to byte
int2char            integer to character
int2short           integer to signed short
lcmp                long compare
fcmpl               float compare. -1 on incomparable
fcmpg               float compare. 1 on incomparable
dcmpl               dbl floating cmp. -1 on incomp
dcmpg               dbl floating cmp. 1 on incomp
ifeq                goto if equal
ifne                goto if not equal
iflt                goto if less than
ifge                goto if greater than or equal
ifgt                goto if greater than
ifle                goto if less than or equal
if_icmpeq           compare top two elements of stack
if_icmpne           compare top two elements of stack
if_icmplt           compare top two elements of stack
if_icmpge           compare top two elements of stack
if_icmpgt           compare top two elements of stack
if_icmple           compare top two elements of stack
if_acmpeq           compare top two objects of stack
if_acmpne           compare top two objects of stack
goto                unconditional goto
jsr                 jump subroutine
ret                 return from subroutine
tableswitch         goto (case)
lookupswitch        goto (case)
ireturn             return integer from procedure
lreturn             return long from procedure
freturn             return float from procedure
dreturn             return double from procedure
areturn             return object from procedure
return              return (void) from procedure
getstatic           get static field value
putstatic           assign static field value
getfield            get field value from object
putfield            assign field value to object.
invokevirtual       call method, based on object.
invokenonvirtual    call method, not based on object.
invokestatic        call a static method.
invokeinterface     call an interface method
new                 Create a new object
newarray            Create a new array of non-objects
anewarray           Create a new array of objects
arraylength         get length of array
athrow              throw an exception
checkcast           error if object not of given type
instanceof          is object of given type?
monitorenter        enter a monitored region of code
monitorexit         exit a monitored region of code
wide                prefix operation.
multianewarray      create multidimensional array
ifnull              goto if null
ifnonnull           goto if not null
goto_w              unconditional goto. 4 byte offset
jsr_w               jump subroutine. 4 byte offset
breakpoint          call breakpoint handler

TABLE 2

Pseudocode for JAVA Bytecode Verifier

Receive Object Class File with one or more bytecode programs to
    be verified.
/* Perform initial checks that do not require inspection of bytecodes */
If file format of the class file is improper
{
    Print appropriate error message
    Return with Abort return code
}
If  (A) any "final" class has a subclass;
    (B) the class is a subclass of a "final" class;
    (C) any method in the class overrides a "final" method in a
        superclass; or
    (D) any class reference, field reference and method reference in the
        constant pool does not have a legal name, class and type
        signature
{
    Print appropriate error message
    Return with Abort return code
}
For each Bytecode Method in the Class
{
    /* Data-flow analysis is performed on each method of the class
       being verified */
    If: (A) any branch instruction would branch into the middle of an
            instruction,
        (B) any register references access or modify a register having a
            register number higher than the number of registers used by the
            method,
        (C) the method ends in the middle of an instruction,
        (D) any instruction having a reference into the constant pool

        does not match the data type of the referenced constant pool
        item,
        (E) any exception handler does not have properly specified
        starting and ending points,
    {
        Print appropriate error message
        Return with Abort return code
    }

    Create: status data structures: stack counter, stack status array,
        register status array, jsr bit vector array
    Create Snapshot array with one Snapshot for every instruction in the
        bytecode program
    Initialize Snapshot for first instruction of program to indicate the
        stack is empty and the registers are empty except for data types
        indicated by the method's type signature (i.e., for arguments
        to be passed to the method)
    Initialize Snapshots for all other instructions to indicate that the
        instruction has not yet been visited
    Set the "changed" bit for the first instruction of the program
    Set VerificationSuccess to True

    Do Until there are no instructions whose changed bit is set
    {
        Select a next instruction (e.g., in sequential order in program)
            whose changed bit is set
        Load Snapshot for the selected instruction (showing status of
            stack and registers prior to execution of the selected
            instruction) into the stack counter, virtual stack and the
            virtual register array, and jsr bit vector array, respectively.
        Turn off the selected instruction's changed bit

        /* Emulate the effect of this instruction on the stack and
           registers */
        Case(Instruction Type):
        {
            Case=Instruction pops data from Operand Stack
            {
                Pop operand data type information from Virtual Stack
                Update Stack Counter
                If Virtual Stack has Underflowed
                {
                    Print error message identifying place in program that
                        underflow occurred
                    Abort Verification
                    Return with abort return code
                }
                Compare data type of each operand popped from virtual
                    stack with data type required (if any) by the bytecode
                    instruction
                If type mismatch
                {
                    Print message identifying place in program that data
                        type mismatch occurred
                    Set VerificationSuccess to False
                    Return with abort return code


                }
            }
            Case=Instruction pushes data onto Operand Stack
            {
                Push data type information onto Virtual Stack
                Update stack counter
                If Virtual Stack has Overflowed
                {
                    Print message identifying place in program that
                        overflow occurred
                    Set VerificationSuccess to False
                    Return with abort return code
                }
            }
            Case=Instruction uses data stored in a register
            {
                If type mismatch
                {
                    Print message identifying place in program that data
                        type mismatch occurred
                    Set VerificationSuccess to False
                }
            }
            Case=Instruction modifies a register
            {
                Update Virtual Register Array to indicate changed register's
                    new data type
                If instruction places an uninitialized object in a register and
                    the instruction is protected by any exception handler
                    (including the special exception handler for a "finally"
                    code block)
                {
                    Print error message
                    Set VerificationSuccess to False
                }
            }
            Case=Backwards Branch
            {
                If Virtual Stack or Virtual Register Array contain any
                    uninitialized object data types
                {
                    Print error message
                    Set VerificationSuccess to False
                }
            }
        } /* EndCase */

        /* Update jsr bit vector array */
        If the current instruction is in a subroutine that is the target of a jsr
        {
            For each level of jsr applicable to the current instruction
            {
                Update corresponding jsr bit vector to indicate register(s)
                    accessed or modified by the current instruction
                /* Set of "marked" registers can only be increased, not
                   decreased */
            }
        }

        /* Update all affected Snapshots and changed bits */
        Determine set of all successor instructions, including:
            (A) the next instruction if the current instruction is not an
                unconditional goto, a return, or a throw,
            (B) the target of a conditional or unconditional branch,
            (C) all exception handlers for this instruction,
            (D) when the current instruction is a return instruction, the
                successor instructions are the instructions immediately
                following all jsr's that target the called subroutine.
        If the program can "fall off" the last instruction
        {
            Set VerificationSuccess to False
            Return with Abort return code value
        }

        /* Merge the stack counter, virtual stack, virtual register array and jsr bit
           vector arrays into the Snapshots of each of the successor
           instructions */
        Do for each successor instruction:
        {
            If the successor instruction is the first instruction of an exception
                handler
            {
                Change the Stack Status portion of the Snapshot of the
                    successor instruction to contain a single object of the
                    exception type indicated by the exception handler
                    information.
                Set stack counter of the Snapshot of the successor
                    instruction to 1.
                Perform steps noted below for successor instruction
                    handling only with respect to the virtual register array
                    and jsr bit vector array.
            }
            If this is the first time the Snapshot for a successor instruction
                has been visited
            {
                Copy the stack counter, virtual stack, virtual register array
                    and jsr bit vector array into the Snapshot for the
                    successor instruction
                Set the changed bit for the successor instruction
            }
            Else
            {
                /* the instruction has been visited before */
                If the stack counter in the Status Array does not match the
                    stack counter in the existing Snapshot, or the two
                    stacks are not identical with regard to data types
                    (except for differently typed object handles)
                {
                    Set VerificationSuccess to False
                    Return with Abort return code value
                }
                Merge the Virtual Stack and Virtual Register Array values
                    into the values of the existing Snapshot:
                    (A) if two corresponding stack elements or two
                        corresponding register elements contain different
                        object handles, replace the specified data type for the
                        stack or register element with the closest common
                        ancestor of the two handle types;
                    (B) if two corresponding register elements contain
                        different data types (other than handles), denote the
                        specified data type for the register element in the new
                        Snapshot as "unknown" (i.e., unusable);
                    (C) follow special merger rules for merging register
                        status information when the successor instruction is the
                        instruction immediately after a "jsr" instruction and the
                        current instruction is a "ret" instruction:
                        1) for any register that the bit vector indicates
                           that the subroutine has accessed or modified,
                           use the data type of the register at the time of
                           the return, and
                        2) for other registers, use the data type of the
                           register at the time of the preceding jsr
                           instruction.
                /* Note that return, break and continue instructions
                   inside a code block protected by a "finally"
                   exception handler are treated the same as a "jsr"
                   instruction (for a subroutine call to the "finally"
                   exception handler) for verification purposes. */
                Copy the jsr bit vectors into the Snapshot of the
                    successor instructions only to the extent that those
                    successor instructions are inside the same
                    subroutines as the current instruction.
                Set the changed bit for each successor instruction for
                    which the merging of the stack and register values
                    caused any change to the successor instruction's
                    Snapshot.
            }
        } /* End of Do Loop for Successor Instructions */
    } /* End of Do Loop for Instruction Emulation */
} /* End of Loop for Bytecode Methods */

Return (VerificationSuccess)
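
The changed-bit loop of the verifier pseudocode above can be illustrated with a minimal Java sketch. It is not the implementation described in the specification: the seven-instruction set, the `MiniVerifier` class and its `verify` method are hypothetical names introduced only for illustration, and registers, jsr bit vectors and exception handlers are omitted. Because only two primitive types are modeled, a join simply requires identical stacks, so a snapshot never changes after its first visit; the closest-common-ancestor merge of object handles is not shown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only; the instruction set and all names are hypothetical. */
final class MiniVerifier {
    enum Op { PUSH_INT, PUSH_FLOAT, IADD, POP, IFEQ, GOTO, RETURN }
    enum Type { INT, FLOAT }

    /** One instruction; target is meaningful only for IFEQ and GOTO. */
    static final class Insn {
        final Op op;
        final int target;
        Insn(Op op, int target) { this.op = op; this.target = target; }
        Insn(Op op) { this(op, -1); }
    }

    /** Returns true when every path obeys the stack-depth and type rules. */
    static boolean verify(List<Insn> code, int maxStack) {
        int n = code.size();
        if (n == 0) return false;
        List<List<Type>> snapshot = new ArrayList<>();  // null = not yet visited
        for (int i = 0; i < n; i++) snapshot.add(null);
        boolean[] changed = new boolean[n];
        snapshot.set(0, new ArrayList<Type>());         // stack is empty on entry
        changed[0] = true;

        while (true) {
            int pc = -1;                                // select a changed instruction
            for (int i = 0; i < n && pc < 0; i++) if (changed[i]) pc = i;
            if (pc < 0) return true;                    // no changed bits remain
            changed[pc] = false;
            List<Type> stack = new ArrayList<>(snapshot.get(pc)); // virtual stack
            Insn in = code.get(pc);
            List<Integer> successors = new ArrayList<>();

            switch (in.op) {                            // emulate effect on the stack
                case PUSH_INT:   stack.add(Type.INT); break;
                case PUSH_FLOAT: stack.add(Type.FLOAT); break;
                case POP:
                    if (stack.isEmpty()) return false;  // underflow
                    stack.remove(stack.size() - 1);
                    break;
                case IADD:
                    if (!pop(stack, Type.INT) || !pop(stack, Type.INT)) return false;
                    stack.add(Type.INT);
                    break;
                case IFEQ:
                    if (!pop(stack, Type.INT)) return false;
                    successors.add(in.target);
                    break;
                case GOTO:   successors.add(in.target); break;
                case RETURN: break;
            }
            if (stack.size() > maxStack) return false;  // overflow
            if (in.op != Op.GOTO && in.op != Op.RETURN) successors.add(pc + 1);

            for (int s : successors) {
                if (s < 0 || s >= n) return false;      // bad target or falls off the end
                List<Type> old = snapshot.get(s);
                if (old == null) {                      // first visit: copy and mark
                    snapshot.set(s, new ArrayList<>(stack));
                    changed[s] = true;
                } else if (!old.equals(stack)) {
                    return false;                       // depth or type differs at a join
                }
            }
        }
    }

    /** Pops one entry and reports whether it existed and had the wanted type. */
    private static boolean pop(List<Type> stack, Type want) {
        return !stack.isEmpty() && stack.remove(stack.size() - 1) == want;
    }

    public static void main(String[] args) {
        List<Insn> growingLoop = Arrays.asList(new Insn(Op.PUSH_INT), new Insn(Op.GOTO, 0));
        System.out.println(verify(growingLoop, 4));     // prints false
    }
}
```

Under these assumptions, a two-instruction loop that pushes an integer and then jumps back to the start is rejected because the stack depth recorded in the existing snapshot differs from the depth arriving at the join, which corresponds to the check for a loop that produces a net addition of operands.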
Pseudocode for Bytecode Class Loader and Interpreter

Procedure: ClassLoader (Class, Pgm)
{
    If the Class has not already been loaded and verified
    {
        Receive Class
        Call Bytecode Verifier to verify all bytecode programs in the class
        If Not VerificationSuccess
        {
            Print or display appropriate error message
            Return
        }
    }
    Interpret and execute Pgm (the specified bytecode program) without
        performing operand stack overflow and underflow checks and without
        performing data type checks on operands stored in operand stack.
}
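
The verify-once behavior of the class loader pseudocode above may be sketched in Java as follows. This is a hedged illustration rather than the disclosed implementation: the `VerifyingLoader` class, its `load` method and the `verifierCalls` counter are hypothetical, and the bytecode verifier is represented by an arbitrary `Predicate<byte[]>` supplied by the caller.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Predicate;

/** Illustrative sketch only; all names here are hypothetical. */
final class VerifyingLoader {
    private final Set<String> verified = new HashSet<>(); // classes already verified
    private final Predicate<byte[]> verifier;              // stand-in for the verifier
    int verifierCalls = 0;                                  // exposed for demonstration

    VerifyingLoader(Predicate<byte[]> verifier) {
        this.verifier = verifier;
    }

    /** Returns true when the class may be interpreted without runtime checks. */
    boolean load(String className, byte[] classBytes) {
        if (verified.contains(className)) {
            return true;                    // already loaded and verified: skip verifier
        }
        verifierCalls++;
        if (!verifier.test(classBytes)) {
            System.err.println("Verification failed for " + className);
            return false;                   // execution of the program is prevented
        }
        verified.add(className);
        return true;
    }

    public static void main(String[] args) {
        // Assumed toy rule: a class "verifies" when its byte array is non-empty.
        VerifyingLoader loader = new VerifyingLoader(bytes -> bytes.length > 0);
        System.out.println(loader.load("A", new byte[] {1}));  // prints true
        System.out.println(loader.load("A", new byte[] {1}));  // prints true, no re-verification
        System.out.println(loader.verifierCalls);              // prints 1
    }
}
```

Caching the verification result per class is the design point being shown: the cost of verification is paid once, after which the interpreter can run the program repeatedly without stack or type checks.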



What is claimed is: 

1. A method of operating a computer system, the steps of 
the method comprising: 
(A) storing a program in a memory, the program including
a sequence of instructions, where each of a multiplicity
of said instructions each represents an operation on data
of a specific data type; said each instruction having
associated data type restrictions on the data type of data
to be manipulated by said each instruction;

(B) prior to execution of said program, preprocessing said 
program by determining whether execution of any 
instruction in said program would violate said data type 
restrictions for that instruction and generating a pro- 
gram fault signal when execution of any instruction in 
said program would violate the data type restrictions 
for that instruction; 

said preprocessing step including: 

(B1) storing, for each instruction in said program, a
data type snapshot, said data type snapshot including
data type information concerning data types associ- 
ated with data stored in an operand stack and regis- 
ters by said program immediately prior to execution 
of the corresponding instruction; 

(B2) emulating operation of a selected instruction in the 
program by: (B2A) analyzing stack and register 
usage by said selected instruction so as to generate a 
current data type usage map for said operand stack 
and registers, (B2B) determining all successor 
instructions to said selected instruction, (B2C) merging
the current data type usage map with the data
type snapshot of said determined successor 
instructions, and (B2D) marking for further analysis 
each of said determined successor instructions 
whose data type snapshot is modified by said merg- 
ing; 

(B3) emulating operation of each of said instructions 
marked for further analysis by performing step B2 on 
each of those marked instructions and unmarking 
each said emulated instruction; and 

(B4) repeating step B3 until there are no marked 
instructions; 

said step B2A including determining when said stack and
register usage by said instruction would violate said 
data type restrictions for that instruction and generating 
a program fault signal when execution of said instruc- 
tion program would violate said data type restrictions. 

2. The method of claim 1, said step B2 including 
determining whether execution of said selected instruc- 
tion would result in an operand stack underflow or 
overflow, and whether execution of any loop in said 
program would result in a net addition or deletion of 
operands to said operand stack, and for generating a 
program fault signal when said execution of said 
selected instruction would result in an operand stack 
underflow or overflow and when execution of any loop 
in said program would produce a net addition or 
deletion of operands to said operand stack. 

3. The method of claim 1, including

(C) when said preprocessing of said program results in the 
generation of no program fault signals, enabling execu- 
tion of said program; 

(D) when said preprocessing of said program results in the 
generation of a program fault, preventing execution of 
said program; and 

(E) when execution of said bytecode program has been 
enabled, executing said bytecode program without per- 
forming data type checks on operands stored in said 
operand stack during execution of said bytecode pro- 
gram. 

4. The method of claim 1,

said program including at least one object creation
instruction and at least one object initialization
instruction;

said step 2B including storing in said current usage data
map, for each object that would be stored in said
operand stack and registers after execution of said
selected instruction, a data type value for each uninitialized
object that is distinct from a corresponding data
type value for the same object after initialization
thereof;

said step 2B further including, when said selected instruction
is not said at least one object initialization
instruction, generating a program fault signal when
execution of said selected instruction would access a
stack operand or register whose data type corresponds
to an uninitialized object.

5. The method of claim 4,

said step 2B further including, when said selected instruction
is said at least one object initialization instruction,
generating a program fault signal when execution of
said selected instruction would access a stack operand
or register whose data type corresponds to an initialized
object.

6. The method of claim 1, 

said program including at least one jump to subroutine 
(jsr) instruction and at least one subroutine return (ret) 
instruction located within a subroutine included in said 
program; 

said step B2B including, when the current instruction is
said subroutine return instruction, determining each of
said successor instructions to be an instruction immediately
following a jsr instruction for jumping to said
subroutine;

said step B2C including, when the current instruction is
said subroutine return instruction, merging the current
data type usage map with the data type snapshot of each
said determined successor instructions by storing in the
data type snapshot for each said successor instruction
data type information from said current data type usage
map for each register accessed and each register modified
by said subroutine and data type information for
each other register from the data type snapshot for the
jsr instruction immediately preceding said each successor
instruction.

7. A computer system, comprising: 

memory for storing a program, the program including a
sequence of instructions, where each of a multiplicity
of said instructions each represents an operation on data
of a specific data type; said each instruction having
associated data type restrictions on the data type of data
to be manipulated by said each instruction;

a data processing unit for executing programs stored in
said memory;

a program verifier, stored in said memory, said program
verifier including data type testing instructions for
determining whether execution of any instruction in
said program would violate said data type restrictions
for that instruction and generating a program fault
signal when execution of any instruction in said program
would violate the data type restrictions for that
instruction;

said data type testing instructions including:

instructions for storing, for each instruction in said
program, a data type snapshot, said data type snapshot
including data type information concerning data
types associated with data stored in an operand stack
and registers by said program immediately prior to
execution of the corresponding instruction;

instructions for emulating operation of a selected
instruction in the program by: analyzing stack and
register usage by said selected instruction so as to
generate a current data type usage map for said
operand stack and registers, determining all successor
instructions to said selected instruction, merging
the current data type usage map with the data type
snapshot of said determined successor instructions,
and marking for further analysis each of said determined
successor instructions whose data type snapshot
is modified by said merging;

instructions for emulating operation of each of said 
instructions marked for further analysis and unmark- 
ing each said emulated instruction; and 

instructions for continuing to emulate operation of any 
instructions marked for further analysis until there 
are no marked instructions; 

said data type testing instructions including instructions 
for determining when said stack and register usage 
by said instruction would violate said data type 
restrictions for that instruction and generating a 
program fault signal when execution of said instruc- 
tion program would violate said data type restric- 
tions. 

8. The computer system of claim 7, including: 
program execution enabling instructions that enable 

execution of said bytecode program only after process- 
ing said bytecode program by said bytecode program 
verifier generates no program fault signals; and 
a bytecode program interpreter, coupled to said bytecode
program enabling instructions, for executing said bytecode
program after processing of said bytecode program
by said bytecode program verifier and after said
bytecode program enabling instructions enable execution
of said bytecode program by said bytecode program
interpreter; said bytecode program interpreter
including instructions for executing said bytecode program
without performing data type checks on operands
stored in said operand stack.

9. The computer system of claim 8, 

said data type testing instructions including stack 
overflow/underflow testing instructions for determin- 
ing (A) whether execution of said program would result 
in an operand stack underflow or overflow, and (B) 
whether execution of any loop in said program would 
result in a net addition or deletion of operands to said 
operand stack, and for generating a program fault 
signal when said execution of said selected instruction 
would result in an operand stack underflow or overflow 
and when execution of any loop in said program would 
produce a net addition or deletion of operands to said 
operand stack; and 

said executing instructions of said bytecode program 
interpreter including instructions for executing said 
bytecode program without performing operand stack 
underflow and overflow checks during execution of 
said bytecode program.

10. The computer system of claim 7, 

said program including at least one object creation 
instruction and at least one object initialization instruc- 
tion; 

said data type testing instructions including instructions
for storing in said current usage data map, for each
object that would be stored in said operand stack and
registers after execution of said selected instruction, a
data type value for each uninitialized object that is
distinct from a corresponding data type value for the
same object after initialization thereof;

said data type testing instructions further including
instructions for generating, when said selected instruction
is not said at least one object initialization
instruction, a program fault signal when execution of
said selected instruction would access a stack operand
or register whose data type corresponds to an uninitialized
object.

11. The computer system of claim 10, 

said data type testing instructions further including
instructions for generating, when said selected instruction
is said at least one object initialization instruction,
a program fault signal when execution of said selected
instruction would access a stack operand or register
whose data type corresponds to an initialized object.

12. The computer system of claim 7,
said program including at least one jump to subroutine 

(jsr) instruction and at least one subroutine return (ret) 
instruction located within a subroutine included in said 
program; 

said data type testing instructions including instructions
for determining, when the current instruction is said
subroutine return instruction, each of said successor
instructions to be an instruction immediately following
a jsr instruction for jumping to said subroutine;

said data type testing instructions including instructions
for merging, when the current instruction is said subroutine
return instruction, the current data type usage
map with the data type snapshot of each said determined
successor instructions by storing in the data type
snapshot for each said successor instruction data type
information from said current data type usage map for
each register accessed and each register modified by
said subroutine and data type information for each
other register from the data type snapshot for the jsr
instruction immediately preceding said each successor
instruction.

13. A computer program product for use in conjunction
with a computer system, the computer program product
comprising a computer readable storage medium and a
computer program mechanism embedded therein, the computer
program mechanism comprising:

a program stored in said memory, the program including
a sequence of instructions, where each of a multiplicity
of said instructions each represents an operation on data
of a specific data type; said each instruction having
associated data type restrictions on the data type of data
to be manipulated by said each instruction;

a program verifier, stored in said memory, said program
verifier including data type testing instructions for
determining whether execution of any instruction in
said program would violate said data type restrictions
for that instruction and generating a program fault
signal when execution of any instruction in said program
would violate the data type restrictions for that
instruction;

said data type testing instructions including:

instructions for storing, for each instruction in said
program, a data type snapshot, said data type snapshot
including data type information concerning data
types associated with data stored in an operand stack
and registers by said program immediately prior to
execution of the corresponding instruction;

instructions for emulating operation of a selected
instruction in the program by: analyzing stack and
register usage by said selected instruction so as to
generate a current data type usage map for said
operand stack and registers, determining all successor
instructions to said selected instruction, merging
the current data type usage map with the data type
snapshot of said determined successor instructions,
and marking for further analysis each of said determined
successor instructions whose data type snapshot
is modified by said merging;
instructions for emulating operation of each of said 
instructions marked for further analysis and unmark- 
ing each said emulated instruction; and 
instructions for continuing to emulate operation of any 
instructions marked for further analysis until there 
are no marked instructions; 
said data type testing instructions including instructions 
for determining when said stack and register usage by 
said instruction would violate said data type restrictions 
for that instruction and generating a program fault
signal when execution of said instruction program 
would violate said data type restrictions. 

14. The computer program product of claim 13, including:
program execution enabling instructions that enable 

execution of said bytecode program only after process- 
ing said bytecode program by said bytecode program 
verifier generates no program fault signals; and 
a bytecode program interpreter, coupled to said bytecode 
program enabling instructions, for executing said byte- 
code program after processing of said bytecode pro- 
gram by said bytecode program verifier and after said 
bytecode program enabling instructions enable execu- 
tion of said bytecode program by said bytecode pro- 
gram interpreter; said bytecode program interpreter 
including instructions for executing said bytecode pro- 
gram without performing data type checks on operands 
stored in said operand stack. 

15. The computer program product of claim 14, 

said data type testing instructions including stack 
overflow/underflow testing instructions for determin- 
ing (A) whether execution of said program would result 
in an operand stack underflow or overflow, and (B) 
whether execution of any loop in said program would 
result in a net addition or deletion of operands to said 
operand stack, and for generating a program fault 
signal when said execution of said selected instruction 
would result in an operand stack underflow or overflow 
and when execution of any loop in said program would 
produce a net addition or deletion of operands to said 
operand stack; and

said executing instructions of said bytecode program 
interpreter including instructions for executing said 
bytecode program without performing operand stack
underflow and overflow checks during execution of 
said bytecode program. 

16. The computer program product of claim 13, 

said program including at least one object creation 
instruction and at least one object initialization instruc- 
tion; 

said data type testing instructions including instructions 
for storing in said current usage data map, for each
object that would be stored in said operand stack and 
registers after execution of said selected instruction, a 
data type value for each uninitialized object that is 
distinct from a corresponding data type value for the 
same object after initialization thereof; 

said data type testing instructions further including
instructions for generating, when said selected instruction
is not said at least one object initialization
instruction, a program fault signal when execution of
said selected instruction would access a stack operand
or register whose data type corresponds to an uninitialized
object.

17. The computer program product of claim 16,

said data type testing instructions further including
instructions for generating, when said selected instruction
is said at least one object initialization instruction,
a program fault signal when execution of said selected
instruction would access a stack operand or register
whose data type corresponds to an initialized object.

18. The computer program product of claim 13,

said program including at least one jump to subroutine
(jsr) instruction and at least one subroutine return (ret)
instruction located within a subroutine included in said
program;

said data type testing instructions including instructions
for determining, when the current instruction is said
subroutine return instruction, each of said successor
instructions to be an instruction immediately following 
a jsr instruction for jumping to said subroutine; 
said data type testing instructions including instructions 
for merging, when the current instruction is said sub- 
routine return instruction, the current data type usage 
map with the data type snapshot of each said deter- 
mined successor instructions by storing in the data type 
snapshot for each said successor instruction data type 
information from said current data type usage map for 
each register accessed and each register modified by 
said subroutine and data type information for each 
other register from the data type snapshot for the jsr 
instruction immediately preceding said each successor 
instruction. 
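
The register merge recited for the subroutine return case (claims 6, 12 and 18, and merger rule (C) of the verifier pseudocode) can be illustrated by the following minimal Java sketch. The `JsrRetMerge` class, the `mergeOnReturn` method and the use of strings as data type labels are assumptions made only for illustration; `java.util.BitSet` stands in for the jsr bit vector.

```java
import java.util.Arrays;
import java.util.BitSet;

/** Illustrative sketch only; names and the string type labels are hypothetical. */
final class JsrRetMerge {
    /**
     * Computes register types for the instruction following a jsr.
     * atJsr:   register types when the jsr was executed
     * atRet:   register types when the subroutine's ret is reached
     * touched: bit r is set when the subroutine accessed or modified register r
     */
    static String[] mergeOnReturn(String[] atJsr, String[] atRet, BitSet touched) {
        if (atJsr.length != atRet.length) {
            throw new IllegalArgumentException("register count mismatch");
        }
        String[] out = new String[atJsr.length];
        for (int r = 0; r < out.length; r++) {
            // Touched registers take the type at the return; all others keep
            // the type they had at the preceding jsr.
            out[r] = touched.get(r) ? atRet[r] : atJsr[r];
        }
        return out;
    }

    public static void main(String[] args) {
        BitSet touched = new BitSet();
        touched.set(2);                       // subroutine used register 2 only
        String[] merged = mergeOnReturn(
            new String[] {"int", "float", "Object"},
            new String[] {"int", "unknown", "String"},
            touched);
        System.out.println(Arrays.toString(merged)); // prints [int, float, String]
    }
}
```

In this example register 1 keeps the type `float` from the jsr site even though the subroutine's own snapshot had marked it unknown, because the bit vector shows the subroutine never touched it.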

***** 
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